#Local LLM

32 articles

Tech · Mar 18, 2026 · 4 min

Holotron-12B Makes PC-Operation AI 1.7× Faster, and Unsloth Studio Lets You Tune Models Without Code

H Company's Holotron-12B uses a memory-efficient new design to lift PC-operation AI throughput to 8,900 tokens per second. Unsloth has released the beta of 'Studio,' a browser tool for no-code model fine-tuning.

AI, LLM, AI Agents, Unsloth, Local LLM

Tech · Mar 1, 2026 · 11 min

The Reason Qwen 3.5 Failed on Radeon 8060S Was an Outdated AMD Driver

Isolating the cause of Qwen 3.5 failing on ROCm/Vulkan via CPU inference, llama-server, and LM Studio — an AMD driver update resolved everything.

AI, LLM, Local LLM, AMD, llama.cpp, Ollama, LM Studio, Experiment

Tech · Feb 28, 2026 (updated) · 12 min

Qwen 3.5 abliterated in Ollama: broken outputs, chat-template failures, and the official-model workaround

Hands-on test of huihui-ai Qwen 3.5 abliterated models in Ollama: garbage-token failures, GLM-4.7-Flash chat-template breakage, and why the official model with thinking disabled worked better.

AI, LLM, Ollama, Local LLM, AMD, LM Studio, Vulkan, ROCm, Experiment

Tech · Feb 27, 2026 · 9 min

Building a Local OCR Hot Folder for Confidential Documents with ScanSnap + NDLOCR-Lite

A folder-watching script that automatically OCRs images scanned by ScanSnap and runs LLM-based correction. Includes an air-gapped security design.

OCR, NDLOCR-Lite, ScanSnap, Python, Mac, Local LLM, Experiment

Tech · Feb 27, 2026 · 8 min

Three Months with NDLOCR: The Full Journey and What Others Built

From Docker hell to Lite + LLM correction. A retrospective on my own experimentation, plus an introduction to someone else's browser-based NDLOCR-Lite implementation.

OCR, NDLOCR, NDLOCR-Lite, Python, Docker, Local LLM, ONNX, WebAssembly, Experiment

Tech · Feb 26, 2026 · 13 min

OCR Correction on Showa-Era Documents with NDLOCR-Lite and Local LLMs

Set up the CLI version of NDLOCR-Lite on an Apple Silicon Mac, then tested OCR result correction with Qwen 3.5 and Swallow. Includes experiments with direct image reading and the anchoring effect.

OCR, Python, NDLOCR-Lite, Mac, Qwen, Swallow, Ollama, Local LLM, Experiment

Tech · Feb 15, 2026 · 6 min

Setting Up a Local LLM on the GMKtec EVO-X2 (Strix Halo)

Building an NSFW-capable local LLM on the GMKtec EVO-X2 (Strix Halo). Getting GPU inference at ~11 tokens/s with LM Studio and MS3.2-24B-Magnum-Diamond.

AI, LLM, Local LLM, LM Studio, AMD, Experiment

Tech · Feb 4, 2026 · 3 min

ACE-Step 1.5: AI Music Generation Gets a Full Architecture Overhaul

ACE-Step V1.5 has been released with a hybrid LM+DiT architecture, 50+ language support, and 4GB VRAM minimum — a major evolution from V1.0.

AI, Audio Generation, Local LLM